129 research outputs found

    Towards Robust Curve Text Detection with Conditional Spatial Expansion

    Full text link
    It is challenging to detect curve texts due to their irregular shapes and varying sizes. In this paper, we first investigate the deficiency of the existing curve detection methods and then propose a novel Conditional Spatial Expansion (CSE) mechanism to improve the performance of curve text detection. Instead of regarding the curve text detection as a polygon regression or a segmentation problem, we treat it as a region expansion process. Our CSE starts with a seed arbitrarily initialized within a text region and progressively merges neighborhood regions based on the extracted local features by a CNN and contextual information of merged regions. The CSE is highly parameterized and can be seamlessly integrated into existing object detection frameworks. Enhanced by the data-dependent CSE mechanism, our curve text detection system provides robust instance-level text region extraction with minimal post-processing. The analysis experiment shows that our CSE can handle texts with various shapes, sizes, and orientations, and can effectively suppress the false-positives coming from text-like textures or unexpected texts included in the same RoI. Compared with the existing curve text detection algorithms, our method is more robust and enjoys a simpler processing flow. It also creates a new state-of-art performance on curve text benchmarks with F-score of up to 78.4%\%.Comment: This paper has been accepted by IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2019

    NA2\text{A}^\text{2}Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning

    Full text link
    Value decomposition is widely used in cooperative multi-agent reinforcement learning, however, its implicit credit assignment mechanism is not yet fully understood due to black-box networks. In this work, we study an interpretable value decomposition framework via the family of generalized additive models. We present a novel method, named Neural Attention Additive Q-learning~(NA2\text{A}^\text{2}Q), providing inherent intelligibility of collaboration behavior. NA2\text{A}^\text{2}Q can explicitly factorize the optimal joint policy induced by enriching shape functions to model all possible coalitions of agents into individual policies. Moreover, we construct identity semantics to promote estimating credits together with the global state and individual value functions, where local semantic masks help us diagnose whether each agent captures relevant-task information. Extensive experiments show that NA2\text{A}^\text{2}Q consistently achieves superior performance compared to different state-of-the-art methods on all challenging tasks, while yielding human-like interpretability

    Clock Synchronized Transmission of 51.2GBd Optical Packets for Optically Switched Data Center Interconnects

    Get PDF
    Optical switching has attracted significant attention in recent research on data center networks (DCNs) as it is a promising viable route for the further scaling of hyper scale data centers, so that DCNs can keep pace with the rapid growth of machine-to-machine traffic. It has been shown that optical clock synchronization enables sub-nanosecond clock and data recovery time and is crucial to high performance optically switched DCN. Moreover, the interconnect data rate is expected to increase from the current 100 Gb/s per fiber to scale to 800 Gb/s and beyond, requiring high baud rate signaling at >50 GBd. Thus, future optically switched DCN should support >50GBd data transmission with optical clock synchronization. Here, we demonstrate the clock- synchronized transmission of 128-byte optical packets at 51.2 GBd and study the impact of reference clock phase noise on system performance, focusing on the tolerance to the clock phase misalignment that affects the system scalability and reliability. By comparing the tolerable sampling clock phase offsets using different reference clocks, we show that a clock phase offset window of about 8ps could be achieved with a <0.2ps source clock. Furthermore, we model and numerically study the de- correlation of clock phase noise. This allows the total jitter to be estimated, and thereby, the estimation of the transmission performance for future generations of high baud rate, clock synchronized DC interconnects

    Experimental Demonstration of Multiband Comb-Enabled mm-Wave Transmission

    Get PDF
    A novel system architecture to realize multiple synchronized sources for multiband millimeter-wave (mm-Wave) transmission has been designed and experimentally demonstrated in two of the W -band subbands at 100 and 112.5 GHz. The technique converts a distributed electro-optic comb, generated off-site, to a local electrical comb. The higher frequency tones in the resulting comb are extracted and used as mm-Wave oscillator sources. Thus, the architecture provides a method to generate multiple frequency-synchronized sources, using only a single electronic oscillator, with exceptionally low phase noise for simultaneous multiband mm-Wave transmission. Additionally, a lower frequency tone at 6.25 GHz is broadcasted over the air, providing synchronization reference between the mm-Wave transmitters and receivers

    Reinforcement Learning for Self-exploration in Narrow Spaces

    Full text link
    In narrow spaces, motion planning based on the traditional hierarchical autonomous system could cause collisions due to mapping, localization, and control noises. Additionally, it is disabled when mapless. To tackle these problems, we leverage deep reinforcement learning which is verified to be effective in self-decision-making, to self-explore in narrow spaces without a map while avoiding collisions. Specifically, based on our Ackermann-steering rectangular-shaped ZebraT robot and its Gazebo simulator, we propose the rectangular safety region to represent states and detect collisions for rectangular-shaped robots, and a carefully crafted reward function for reinforcement learning that does not require the destination information. Then we benchmark five reinforcement learning algorithms including DDPG, DQN, SAC, PPO, and PPO-discrete, in a simulated narrow track. After training, the well-performed DDPG and DQN models can be transferred to three brand new simulated tracks, and furthermore to three real-world tracks

    Spatiotemporal Graph Neural Network based Mask Reconstruction for Video Object Segmentation

    Full text link
    This paper addresses the task of segmenting class-agnostic objects in semi-supervised setting. Although previous detection based methods achieve relatively good performance, these approaches extract the best proposal by a greedy strategy, which may lose the local patch details outside the chosen candidate. In this paper, we propose a novel spatiotemporal graph neural network (STG-Net) to reconstruct more accurate masks for video object segmentation, which captures the local contexts by utilizing all proposals. In the spatial graph, we treat object proposals of a frame as nodes and represent their correlations with an edge weight strategy for mask context aggregation. To capture temporal information from previous frames, we use a memory network to refine the mask of current frame by retrieving historic masks in a temporal graph. The joint use of both local patch details and temporal relationships allow us to better address the challenges such as object occlusion and missing. Without online learning and fine-tuning, our STG-Net achieves state-of-the-art performance on four large benchmarks (DAVIS, YouTube-VOS, SegTrack-v2, and YouTube-Objects), demonstrating the effectiveness of the proposed approach.Comment: Accepted by AAAI 202
    • …
    corecore